The speciation of the proteome

Author: Apweiler Rolf
Holzhütter Hermann G
Jungblut Peter R
Schlüter Hartmut
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Introduction: In proteomics a paradoxical situation has developed in recent years. On the one hand, it is basic knowledge that proteins are post-translationally modified and occur in different isoforms. On the other hand, the protein expression concept disclaims post-translational modifications by connecting protein names directly with function. Discussion: Optimal proteome coverage is today reached by bottom-up liquid chromatography/mass spectrometry. But quantification at the peptide level in shotgun or bottom-up approaches by liquid chromatography and mass spectrometry completely ignores that a given peptide may exist in an unmodified form and in several-fold modified forms. The acceptance of the protein species concept is a basic prerequisite for meaningful quantitative analyses in functional proteomics. In discovery approaches only top-down analyses, separating the protein species before digestion, identification and quantification by two-dimensional gel electrophoresis or protein liquid chromatography, allow the correlation between changes of a biological situation and function. Conclusion: To obtain biologically relevant information, kinetics and systems biology have to be performed at the protein species level, which is the major challenge in proteomics today.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Improved Core Genes Prediction for Constructing well-supported Phylogenetic Trees in large sets of Plant Species

Author: B. Alkindy
B. Stoebe
D. Grzebyk
M. Chiara De
M. Kearse
N. Chaffey
N. Zafar
R. Apweiler
S.K. Wyman
V. Ranwez
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Inferring well-supported phylogenetic trees that precisely reflect the evolutionary process is a challenging task that depends entirely on how the related core genes have been found. In previous computational biology studies, many similarity-based algorithms, mainly dependent on calculating sequence alignment matrices, have been proposed to find them. In these kinds of approaches, a significantly high similarity score between two coding sequences extracted from a given annotation tool is taken to mean that the two sequences are the same gene. In a previous article, we presented a quality test approach (QTA) that improves core gene quality by combining two annotation tools (namely NCBI, a partially human-curated database, and DOGMA, an efficient annotation algorithm for chloroplasts). This method takes advantage of both sequence similarity and gene features to guarantee that the core genome contains correct and well-clustered coding sequences (i.e., genes). We then show in this article how useful such well-defined core genes are for biomolecular phylogenetic reconstructions, by investigating various subsets of core genes at various family or genus levels, leading to subtrees with strong bootstraps that are finally merged in a well-supported supertree. Comment: 12 pages, 7 figures, IWBBIO 2015 (3rd International Work-Conference on Bioinformatics and Biomedical Engineering)

arXiv.org e-Print Archive

Crossref

InterProScan: protein domains identifier

Author: Apweiler R.
Harte N.
Lopez R.
Mulder N.
Pillai S.
Quevillon E.
Silventoinen V.
Publication venue: Oxford University Press
Publication date: 27/06/2005

InterProScan [E. M. Zdobnov and R. Apweiler (2001) Bioinformatics, 17, 847–848] is a tool that combines different protein signature recognition methods from the InterPro [N. J. Mulder, R. Apweiler, T. K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bradley, P. Bork, P. Bucher, L. Cerutti et al. (2005) Nucleic Acids Res., 33, D201–D205] consortium member databases into one resource. At the time of writing there are 10 distinct publicly available databases in the application. Protein as well as DNA sequences can be analysed. A web-based version is accessible for academic and commercial organizations from the EBI. In addition, a standalone Perl version and a SOAP Web Service [J. Snell, D. Tidwell and P. Kulchenko (2001) Programming Web Services with SOAP, 1st edn. O'Reilly Publishers, Sebastopol, CA] are also available to the users. Various output formats are supported and include text tables, XML documents, as well as various graphs to help interpret the results.

Crossref

PubMed Central

The Ontology Lookup Service: more data and better tools for controlled vocabulary queries

Author: Cote
H. Hermjakob
Hull
L. Martens
Orchard
P. Jones
R. Apweiler
R. G. Cote
Smith
Publication venue: Oxford University Press
Publication date: 01/01/2008

The Ontology Lookup Service (OLS) (http://www.ebi.ac.uk/ols) provides interactive and programmatic interfaces to query, browse and navigate an ever increasing number of biomedical ontologies and controlled vocabularies. The volume of data available for querying has more than quadrupled since it went into production and OLS functionality has been integrated into several high-usage databases and data entry tools. Improvements have been made to both OLS query interfaces, based on user feedback and requirements, to improve usability and service interoperability and provide novel ways to perform queries

CiteSeerX

Crossref

Ghent University Academic Bibliography

PubMed Central

Cellular resolution models for even skipped regulation in the entire Drosophila embryo

Author: Apweiler Rolf
DePace Angela H
Fisher Jasmin
Ilsley Garth R
Luscombe Nicholas M
Publication venue: 'eLife Sciences Publications, Ltd'
Publication date: 01/03/2014

Transcriptional control ensures genes are expressed in the right amounts at the correct times and locations. Understanding quantitatively how regulatory systems convert input signals to appropriate outputs remains a challenge. For the first time, we successfully model even skipped (eve) stripes 2 and 3+7 across the entire fly embryo at cellular resolution. A straightforward statistical relationship explains how transcription factor (TF) concentrations define eve’s complex spatial expression, without the need for pairwise interactions or cross-regulatory dynamics. Simulating thousands of TF combinations, we recover known regulators and suggest new candidates. Finally, we accurately predict the intricate effects of perturbations including TF mutations and misexpression. Our approach imposes minimal assumptions about regulatory function; instead we infer underlying mechanisms from models that best fit the data, like the lack of TF-specific thresholds and the positional value of homotypic interactions. Our study provides a general and quantitative method for elucidating the regulation of diverse biological systems. DOI: http://dx.doi.org/10.7554/eLife.00522.00

Harvard University - DASH

E-MSD: an integrated data resource for bioinformatics

Author: Apweiler R.
Barrell D.
Henrick K.
McNeil P.
Mittard-Runte V.
Suarez A.
Velankar S.
Publication venue: Oxford University Press
Publication date: 17/12/2004

The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the ‘Structure Integration with Function, Taxonomy and Sequences (SIFTS)’ initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group

Crossref

PubMed Central

The GOA database in 2009—an integrated Gene Ontology Annotation resource

Author: C. O'Donovan
Camon
D. Barrell
D. Binns
E. Dimmer
Gattiker
Kersey
Lomax
Lovering
Mulder
R. Apweiler
R. P. Huntley
Thomas
Yon Rhee
Publication venue: Oxford University Press
Publication date: 29/10/2008

The Gene Ontology Annotation (GOA) project at the EBI (http://www.ebi.ac.uk/goa) provides high-quality electronic and manual associations (annotations) of Gene Ontology (GO) terms to UniProt Knowledgebase (UniProtKB) entries. Annotations created by the project are collated with annotations from external databases to provide an extensive, publicly available GO annotation resource. Currently covering over 160 000 taxa, with greater than 32 million annotations, GOA remains the largest and most comprehensive open-source contributor to the GO Consortium (GOC) project. Over the last five years, the group has augmented the number and coverage of their electronic pipelines and a number of new manual annotation projects and collaborations now further enhance this resource. A range of files facilitate the download of annotations for particular species, and GO term information and associated annotations can also be viewed and downloaded from the newly developed GOA QuickGO tool (http://www.ebi.ac.uk/QuickGO), which allows users to precisely tailor their annotation set

Crossref

PubMed Central

UCL Discovery

Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads

Author: Bentley
Iafrate
K. Ye
Kidd
Levy
M. H. Schulz
Ning
Q. Long
R. Apweiler
Schulz
Sebat
Wheeler
Z. Ning
Publication venue: Oxford University Press
Publication date: 26/06/2009

Motivation: There is a strong demand in the genomic community to develop effective algorithms to reliably identify genomic variants. Indel detection using next-gen data is difficult and identification of long structural variations is extremely challenging

Crossref

PubMed Central

MPG.PuRe

Enabling comparative modeling of closely related genomes: Example genus Brucella

Author: AR Wattam
B Boeckmann
CS Henry
I Thiele
JD Orth
JJ Davis
JJ Gillespie
K Lagesen
K Tanaka
KD Pruitt
L Li
R Apweiler
R Overbeek
RD Fleischmann
RK Aziz
SF Altschul
VM Markowitz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this short report, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial due to annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models of closely related genomes, using fifteen strains of the genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella but were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol to improve automated annotations within the SEED database, and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support the prediction of accurate phenotypes such as pathogenicity, media requirements or type of respiration. We thank Jean Jacques Letesson, Maite Iriarte, Stephan Kohler and David O'Callaghan for their input on improving specific annotations. This project has been funded by the United States National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272200900040C, awarded to BW Sobral, and from the United States National Science Foundation under Grant MCB-1153357, awarded to CS Henry. J.P.F. acknowledges funding from [FRH/BD/70824/2010] of the FCT (Portuguese Foundation for Science and Technology) Ph.D. scholarship.

Universidade do Minho: RepositoriUM

Crossref

Springer - Publisher Connector

PubMed Central

VarySysDB: a human genetic polymorphism database based on all H-InvDB transcripts

Author: Apweiler
C. Gough
C. Yamasaki
Hamosh
Iafrate
Imanishi
Kawabata
M. K. Shimada
R. Matsumoto
R. Sanbonmatsu
Seng
Sherry
T. Gojobori
T. Imanishi
The InterPro Consortium
Y. Hayakawa
Y. Yamaguchi-Kabata
Publication venue: Oxford University Press

Creation of a vast variety of proteins is accomplished by genetic variation and a variety of alternative splicing transcripts. Currently, however, the abundant available data on genetic variation and the transcriptome are stored independently and in a dispersed fashion. In order to provide a research resource regarding the effects of human genetic polymorphism on various transcripts, we developed VarySysDB, a genetic polymorphism database based on 187 156 extensively annotated matured mRNA transcripts from 36 073 loci provided by H-InvDB. VarySysDB offers information encompassing published human genetic polymorphisms for each of these transcripts separately. This allows comparisons of effects derived from a polymorphism on different transcripts. The published information we analyzed includes single nucleotide polymorphisms and deletion–insertion polymorphisms from dbSNP, copy number variations from Database of Genomic Variants, short tandem repeats and single amino acid repeats from H-InvDB and linkage disequilibrium regions from D-HaploDB. The information can be searched and retrieved by features, functions and effects of polymorphisms, as well as by keywords. VarySysDB combines two kinds of viewers, GBrowse and Sequence View, to facilitate understanding of the positional relationship among polymorphisms, genome, transcripts, loci and functional domains. We expect that VarySysDB will yield useful information on polymorphisms affecting gene expression and phenotypes. VarySysDB is available at http://h-invitational.jp/varygene/

Crossref

PubMed Central